A Metadata Lifecycle for Content Analysis in Digital Libraries
Authors: Ya-ning Chen, Shu-jiun Chen, Yi-ting Chang, Simon C. Lin
Abstract
The problem of bringing order to the chaos of digital information at Internet scale has recently been raised by many digital projects around the world. Metadata is an emerging approach to improving precision in resource discovery. The aim of this paper is to present a metadata lifecycle with nine components as a basic model that supports content analysis and organizes digital information within a structured and associated context for the digital library. The lifecycle has proved useful in delivering several benefits to the metadata process of the digital library programme. These benefits include an analytical distribution of metadata types and elements, a relationship-rich approach to content analysis, a context-centric analysis for system integration, a re-examination of workflow, and a two-parallel orientation to metadata standardization.
Introduction
Metadata is an emerging approach to organizing digital information in a structured manner and supporting precise retrieval for digital libraries at Internet scale. Although there are many metadata practices in digital libraries, little of the literature addresses how to choose the right metadata formats for a given project. This paper introduces a metadata lifecycle developed by Academia Sinica as a basis for content analysis, to support the choice of the right metadata standards for the National Digital Archive Initiative sponsored by the National Science Council in Taiwan. More than ten projects of the Initiative serve as case studies to elucidate the framework of the metadata lifecycle and to show the findings. The design and implementation of metadata systems is related to the content of the metadata lifecycle but is not addressed in this paper.
The metadata lifecycle consists of nine parts: interview with content experts, analysis of project requirements and attributes, review of relevant metadata standards and projects, analysis of information requirements, preparation of the metadata requirement specification, evaluation of metadata system and development, preparation of best practice, development of metadata test-bed, and maintenance of metadata service.
Definition
Several relevant works have offered definitions of content analysis. In Bos and Tarnai's view, content analysis is a means of analyzing texts (Bos & Tarnai, 1999, p. 660). The Writing Center at Colorado State University regards content analysis as a research tool used to determine the presence of certain words or concepts within texts or sets of texts (The Writing Center, n.d.). From Stanton's conceptual perspective, content analysis is a technique for analyzing the content of documents that has been in use since the beginning of the century. The term "document" refers to all media: newspapers, diaries, speeches, letters, reports, books, journals, notices, films, photographs, videos, radio, and television programmes (Stanton, 1995, p. 7/2). In short, content analysis is a research tool or technique deployed to clarify the content of documents for various purposes.
A Metadata Lifecycle
In the era of digital libraries, metadata is often used to organize information in an orderly way to support better resource discovery and retrieval. It is important to understand the content of documents before applying any specific metadata format or standard, so content analysis is essential for any digital library project.
According to Stanton, content analysis can be divided into five stages: determine objectives, define the unit of analysis, construct categories for analysis, test coding to assess reliability, and conduct the analysis (Stanton, 1995, pp. 7/2-7/3). Stanton's ideas focus mainly on document analysis for designing a hypermedia system, so they represent a computer-system approach to analyzing documents. Hudgins, Agnew, and Brown, on the other hand, planned a workflow for a metadata project from a project management perspective. Their approach demonstrates ten tasks for managing a metadata project, including understanding the entire project, documentation, maximizing existing infrastructure, choosing and evaluating the appropriate metadata standard, metadata record design, preliminary testing of workflow, initial staff design, workflow testing at midpoint, workflow testing at project conclusion, reporting results, and conclusion (Hudgins, Agnew, & Brown, 1999, pp. 42-53).
Since 2000, more than 20 projects in the Digital Archive Initiative, supported by the National Science Council in Taiwan, have required metadata planning and implementation. To achieve a consistent structure across these projects, the Sinica Metadata Architecture and Research Taskforce (SMART) designed a metadata lifecycle that addresses both project management and content analysis. The lifecycle is composed of nine components and can be triggered again whenever new or changed metadata requirements arise in a project. The tasks of the lifecycle are carried out through a series of questionnaires and tables.
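As an illustration only, the nine components and the re-triggering of the lifecycle can be modelled as an ordered sequence of stages. The sketch below is not part of the SMART tooling: the names `Stage` and `next_stage` are hypothetical, and restarting at the interview stage when requirements change is an assumption, since the paper says only that the lifecycle is triggered again.

```python
from enum import Enum


class Stage(Enum):
    """The nine lifecycle components, in the order the paper lists them."""
    INTERVIEW_WITH_CONTENT_EXPERTS = 1
    ANALYSIS_OF_PROJECT_REQUIREMENTS_AND_ATTRIBUTES = 2
    REVIEW_OF_RELEVANT_STANDARDS_AND_PROJECTS = 3
    ANALYSIS_OF_INFORMATION_REQUIREMENTS = 4
    PREPARATION_OF_REQUIREMENT_SPECIFICATION = 5
    EVALUATION_OF_METADATA_SYSTEM_AND_DEVELOPMENT = 6
    PREPARATION_OF_BEST_PRACTICE = 7
    DEVELOPMENT_OF_METADATA_TEST_BED = 8
    MAINTENANCE_OF_METADATA_SERVICE = 9


def next_stage(current: Stage, requirements_changed: bool = False) -> Stage:
    """Return the stage that follows `current`.

    Assumption (not stated in the paper): a new or changed requirement
    restarts the cycle at the interview stage. Otherwise stages advance in
    order, and the final maintenance stage stays in place until a change
    arrives.
    """
    if requirements_changed:
        return Stage.INTERVIEW_WITH_CONTENT_EXPERTS
    if current is Stage.MAINTENANCE_OF_METADATA_SERVICE:
        return current
    return Stage(current.value + 1)
```

Calling `next_stage(Stage.MAINTENANCE_OF_METADATA_SERVICE, requirements_changed=True)` returns the interview stage, which is how this sketch expresses the cyclical nature of the lifecycle.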
The components of the metadata lifecycle are as follows: interview with content experts, analysis of project requirements and attributes, review of relevant metadata standards and projects, analysis of information requirements, preparation of the metadata requirement specification, evaluation of metadata system and development, preparation of best practice, development of metadata test-bed, and maintenance of metadata service.
Interview with Content Experts
The first step of the metadata lifecycle is a face-to-face interview with content experts to obtain an overview of the metadata requirements of each content project. Two tasks must be completed before the interview session. First, members of the SMART group carefully examine the background of the project by reviewing the project proposal, including its purposes, goals, and expected results. Second, the SMART group sends questionnaires to the content projects asking about scope, metadata elements and structure, legacy records and systems, metadata context, expected results for different stages, contact information, and so on. During the interview session, several points need to be clarified:
● Contact information: who is the contact person, and what is the contact information of the project participants?
● Metadata schedule: when is the metadata expected to be completed?
● Metadata scope: how many types of metadata are required for the project, for example types of object, person, event, temporal term control and expression, and geographic name?
● Legacy record and system: basic information about the legacy system, including metadata elements, structure, number of records, storage format, input method, and system. It is also useful to understand the advantages and disadvantages of the legacy system.
● Metadata context: is only one metadata database to be constructed for this project?
Do any other databases, such as a geographic information system (GIS), need to be integrated with this metadata database?
● Metadata role and function: what kind of role is proposed for metadata in each project? What functions should metadata fulfil, such as resource description, discovery, annotation, or content analysis?
Analysis of Project Requirements and Attributes
A comprehensive analysis is required to establish the metadata requirements and attributes. After detailed discussion in the interview session, the project's metadata requirements are verified systematically. Several points, such as metadata schedule, scope, context, and role and function, should be clearly defined and agreed upon to prepare for the tasks of the next stage. Real examples should also be collected to help SMART members understand the project goals and the meaning of each element.
Review of Relevant Metadata Standards and Projects
The most important metadata task is to select an appropriate existing metadata standard rather than develop a new one. In this stage, the SMART group carefully examines and surveys existing metadata standards and relevant projects, and then compares the standards comprehensively with the metadata requirements of each project. This yields several results. First, current trends and issues in the implementation of metadata standards and projects around the world can be identified and used as advice for practical application and future development. Second, project members can learn how their project differs from similar or homogeneous projects and refocus the expected project goals accordingly. Third, the project's objectives can be segmented into different parts, and the right metadata standard can be decided.
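The comparison between candidate standards and a project's required elements can be sketched as a simple coverage calculation. This is a minimal illustration, not the SMART group's actual method: the function names, the standards `standard_a` and `standard_b`, and all element names are hypothetical, and a real comparison would weigh element semantics rather than matching names alone.

```python
from __future__ import annotations


def coverage(required: set[str], standard: set[str]) -> float:
    """Fraction of the project's required elements that the standard offers."""
    if not required:
        return 0.0
    return len(required & standard) / len(required)


def missing_elements(required: set[str], standard: set[str]) -> set[str]:
    """Required elements that the standard does not provide."""
    return required - standard


def rank_standards(
    required: set[str], candidates: dict[str, set[str]]
) -> list[tuple[str, float]]:
    """Sort candidate standards by coverage, best first (ties broken by name)."""
    scores = {name: coverage(required, elems) for name, elems in candidates.items()}
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical project requirements and candidate standards.
required = {"title", "creator", "date", "place_name", "rights"}
candidates = {
    "standard_a": {"title", "creator", "date", "rights", "format"},
    "standard_b": {"title", "place_name"},
}

ranked = rank_standards(required, candidates)
gaps = missing_elements(required, candidates["standard_a"])
```

Here `ranked` places `standard_a` first, and `gaps` contains the one required element it lacks (`place_name`), which is the kind of gap a project would need to resolve through local extension or a second standard.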
Analysis of Information Requirements
In preparation for the analysis of information requirements, several tasks should be undertaken as a basis for analyzing and ascertaining the project requirements. First, definitions and examples of the initial metadata elements are offered, and are clarified after interview and adjustment. Second, the proposed metadata standard and element list are selected and explained to project participants. Third, the selected standards are compared with the elements required by the project. Fourth, an analytical context diagram showing the various relationships within the metadata scope and context is defined. Fifth, indexing keys and access points are advised as a basis for system design, as well as for metadata role and function. The benefits of this stage are as follows:
● Metadata elements and categories are chosen and clearly defined on the basis of a comparison with an existing metadata standard.
● The distribution of metadata elements is verified by comparison with the selected metadata standards. This includes the distribution across description, administration, system management, rights management, and resource discovery.
● Metadata scope and context are clarified, and the related relationships are clearly delineated and assigned to a variety of categories.
● It can be determined which systems and databases, such as GIS, are to be integrated through the metadata mechanism.
● A crosswalk is established between existing metadata standards and the metadata elements required by the project.
● Real examples and definitions for projects are collected as a basis for the best
Publication date: 2001